Continuous variation in computational morphology - the example of Swiss German
Abstract
Most work in natural language processing is geared towards written, standardized language varieties. This focus is generally justified on practical grounds of data availability and socio-economic relevance, but it does not always reflect the linguistic reality of non-standard varieties. In this paper, we aim at a computational description of the morphology of a language with continuous internal variation, as encountered in most dialect landscapes. The work presented here is applied to Swiss German dialects, which are well documented through dialectological research and are among the most vital in Europe in terms of social acceptance and media exposure.
Similar resources
ArchiMob - A Corpus of Spoken Swiss German
Swiss dialects of German are, unlike most dialects of well standardised languages, widely used in everyday communication. Despite this fact, automatic processing of Swiss German is still a considerable challenge due to the fact that it is mostly a spoken variety rarely recorded and that it is subject to considerable regional variation. This paper presents a freely available general-purpose corp...
Building an Application for Learning the Finger Alphabet of Swiss German Sign Language through Use of the Kinect
We developed an application for learning the finger alphabet of Swiss German Sign Language. It consists of a user interface and a recognition algorithm including the Kinect sensor. The official Kinect Software Development Kit (SDK) does not recognize fingertips. We extended it with an existing algorithm. Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://do...
Morphological analysis and lemmatization for Swiss German using weighted transducers
With written Swiss German becoming more popular in everyday use, it has become a target for text processing. The absence of a standard orthography and the variety of dialects, however, lead to a vast variation in different spellings which makes this task difficult. We built a system based on weighted transducers that recognizes over 90% of the tokens in certain texts. Weights ensure preferring ...
Taxonomic significance of achene morphology in the genus Centaurea L. (Asteraceae)
Achene morphology of 49 taxa of the genus Centaurea L. was studied in terms of 19 different characteristics. On the basis of the variation in these features, some sections, such as sect. Cyanus with hairy hilum, were separated. Despite various differences, C. leuzeoides and C. gilanica were categorized in the section Psephelloideae, a section with lots of character variations within its species...
A Nonlinear Grayscale Morphological and Unsupervised method for Human Facial Synthesis Based on an Example Image
Human facial generation of example image is used as a requirement for biometric applications for the purpose of identifying individuals. In this paper, face generation consists of three main steps. In the first step, detection of significant lines and edges of the example image are carried out using nonlinear grayscale morphology. Then, hair areas are identified from the face of sample. The fin...